Abstract

We propose a novel method to quantify clustering behavior in complex time series and
apply it to high-frequency financial market data. We find that all data sets analyzed
exhibit volatility clustering, whereas for the series filtered with the GARCH model
the degree of volatility clustering is significantly reduced. This result
confirms that our method can measure the volatility clustering effect in financial markets.
I. INTRODUCTION

Financial markets have recently come to be regarded as representative complex systems,
whose properties change dynamically in response to inflows of information
from outside and to interactions between heterogeneous agents [1]. To understand the
complexity of financial markets, interdisciplinary methods have been developed
in physics and economics. Various stylized facts have been observed, such as the long-term
memory of volatility [2], volatility clustering [3], fat tails [4], and multifractality [5, 6].
The most prominent of these stylized facts are the long-term memory and
the clustering of volatility. In previous studies, clustering behavior has been reported
in the return-interval statistics of climate records [7], medical data, extreme floods
[8], and economic data [9].

In econometrics, various models that reflect the volatility clustering effect have been
introduced in order to predict volatility accurately. The autoregressive conditional
heteroskedasticity (ARCH) model [10] and the generalized autoregressive conditional
heteroskedasticity (GARCH) model [11] are representative examples. Much research has thus
been devoted to understanding the micro-mechanisms of the market. However, studies that quantify
the volatility clustering effect are still insufficient. Observing the volatility
clustering effect quantitatively would help us understand the microscopic phenomena of the market.
In this paper, we propose a novel method to quantify the volatility clustering effect in
financial time series.

We find that all data sets analyzed in this paper exhibit volatility clustering,
whereas for the data filtered with the GARCH(1,1) model
the degree of volatility clustering is significantly reduced.

In the next section, we describe the data sets and methods used in this paper. In Section
III we present the results of this study. Section IV concludes.

II. DATA AND METHODS 
A. Data

We investigate volatility clustering quantitatively using the following financial time
series: the 5-minute S&P 500 index from 1995 to 2004
and 5-minute prices of the 28 most liquid individual stocks traded on the NYSE from
1993 to 2002. The return time series $r(t)$ is calculated as the log-difference of the
high-frequency prices, $r(t) = \ln P(t) - \ln P(t-1)$, where $P(t)$ represents the stock
price at time $t$.
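As a minimal illustration of the return definition above, the following sketch computes log returns from a short price list. The prices and the function name `log_returns` are illustrative assumptions, not data or code from this study.

```python
import math


def log_returns(prices):
    """Return r(t) = ln P(t) - ln P(t-1) for consecutive prices."""
    if any(p <= 0 for p in prices):
        raise ValueError("prices must be strictly positive")
    return [math.log(b) - math.log(a) for a, b in zip(prices, prices[1:])]


# Hypothetical 5-minute prices, for illustration only.
prices = [100.0, 101.0, 100.5, 102.0]
returns = log_returns(prices)
print(returns)
```

Because log returns are additive, their sum over any window equals the log of the ratio of the last price to the first price in that window.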

B. Method to quantify the volatility clustering 

In this subsection, we propose a novel method to quantify the volatility clustering effect
present in financial time series. The procedure consists of the following three steps.

Step 1 (Symbolization): We transform the return time series $r(t)$ into the symbolic
series $S(t)$ in order to quantify the volatility clustering effect in the financial data. The
control parameter is the number of bins, and the symbolic series is defined as

$$S(t) = T_i \ \text{ if } r(t) \in T_i, \qquad T = \{T_1, T_2, \cdots, T_{N_b}\}, \tag{1}$$

where $N_b$ is the number of bins. Symbolization allows the conditional distribution to be
calculated with statistical significance.

Step 2 (Calculating the conditional distribution): We estimate the conditional distribution
from the symbolized time series generated in Step 1. The conditional
distribution $P(S_j|S_T)$ is the distribution of the values that immediately follow a
specific symbol $S_T$ in the symbolic data.
We calculate the conditional distribution $P(S_j|S_T)$ for every symbol
in the proper range. If there is volatility clustering, the conditional distribution of
each symbol shows a non-trivial property that depends on the conditioning value $S_T$.

Step 3 (The average value of the conditional distribution): We calculate the
average value of each conditional distribution estimated in Step 2. We consider only the
conditional distributions of symbols in the proper range because extreme symbols
are rare. The volatility clustering effect is observed through the average value of the
conditional distribution, defined as

(2)

where $N_T$ is the number of elements of the conditional distribution $P(S_j|S_T)$ for a
specific symbol $S_T$. If the average value does not depend on the symbolic value $S_T$, there
is no volatility clustering effect, because a time series shows volatility clustering
only when the average value has a positive (negative) relation with the positive (negative)
values of $S_T$.
Next, we calculate the relation between the specific symbolic values $S_T$ and the average values
$\bar{S}_T$ of the conditional distributions. In other words, we measure the degree of volatility
clustering (DVC) from the relationship between $S_T$ and $\bar{S}_T$. The average value
$\bar{S}_T$ is defined as



where $\mathrm{DVC}^{\pm}$ is the degree of volatility clustering for the positive and negative
cases, respectively. When $\mathrm{DVC}^{\pm} = 0$, there is no clustering effect. When
$\mathrm{DVC}^{\pm}$ is nonzero, its relative
magnitude measures the degree of volatility clustering. We can therefore estimate the volatility
clustering effect of a financial time series quantitatively.
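The three steps can be sketched in code as follows. This is only one possible reading of the procedure, not the authors' implementation. The text does not spell out the binning rule, the exact form of the average, or the fitting procedure, so the sketch assumes equal-width bins on standardized returns, an average over the magnitudes of the following symbols, and a least-squares slope fitted separately for $S_T > 0$ and $S_T < 0$. All function names, the bin count, and the synthetic input series are hypothetical.

```python
import math
import random
from collections import defaultdict


def symbolize(returns, n_bins=21, r_max=3.0):
    """Step 1: map standardized returns to integer symbols centred on 0.

    Assumption: equal-width bins on [-r_max, r_max] (in standard deviations),
    with out-of-range values clipped to the outermost symbols.
    """
    if n_bins % 2 == 0:
        raise ValueError("n_bins must be odd so that one bin is centred on 0")
    mean = sum(returns) / len(returns)
    std = math.sqrt(sum((r - mean) ** 2 for r in returns) / len(returns))
    width = 2.0 * r_max / n_bins
    half = n_bins // 2
    symbols = []
    for r in returns:
        k = math.floor(((r - mean) / std + r_max) / width) - half
        symbols.append(max(-half, min(half, k)))
    return symbols


def conditional_means(symbols, min_count=30):
    """Steps 2-3: for each symbol s, average |S(t+1)| over all t with S(t) = s.

    Assumptions: the average is taken over symbol magnitudes, and symbols
    seen fewer than min_count times are dropped as too rare.
    """
    followers = defaultdict(list)
    for s_now, s_next in zip(symbols, symbols[1:]):
        followers[s_now].append(abs(s_next))
    return {s: sum(v) / len(v) for s, v in followers.items() if len(v) >= min_count}


def ols_slope(points):
    """Least-squares slope of y on x for a list of (x, y) pairs."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points to fit a slope")
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    return sxy / sxx


def degree_of_clustering(returns, n_bins=21, min_count=30):
    """Return (DVC+, DVC-): fitted slopes for S_T > 0 and for S_T < 0."""
    means = conditional_means(symbolize(returns, n_bins), min_count)
    pos = [(s, m) for s, m in means.items() if s > 0]
    neg = [(s, m) for s, m in means.items() if s < 0]
    return ols_slope(pos), ols_slope(neg)


# Synthetic demo (not market data): volatility alternates between calm and
# turbulent blocks of 200 points, so large moves tend to follow large moves.
random.seed(0)
clustered = [random.gauss(0.0, 2.0 if (t // 200) % 2 else 0.5) for t in range(40000)]
iid = [random.gauss(0.0, 1.0) for _ in range(40000)]  # no clustering by construction

dvc_pos, dvc_neg = degree_of_clustering(clustered)
iid_pos, iid_neg = degree_of_clustering(iid)
print(f"clustered: DVC+ = {dvc_pos:.2f}, DVC- = {dvc_neg:.2f}")
print(f"i.i.d.:    DVC+ = {iid_pos:.2f}, DVC- = {iid_neg:.2f}")
```

Under these assumptions the clustered series should give a clearly positive slope on the positive side and a clearly negative slope on the negative side, while the independent series should give slopes close to zero. The numerical values depend on the assumed binning and should not be compared directly with the DVC values reported in this paper.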

III. RESULTS 

In this section, we present the volatility clustering effect of the financial time series. In
order to verify the usefulness of the method proposed in this paper, we employ the GARCH(1,1)
model, which reflects the volatility clustering effect.

First, we apply the method to the 5-minute S&P 500 index and calculate the
degree of volatility clustering. Fig. 1a shows the return time series of the 5-minute
S&P 500 index and Fig. 1b shows its symbolic time series. We then calculate the conditional
distribution $P(S|S_T)$ for each specific symbol $S_T$. Fig. 1c shows the conditional
distributions for specific symbols. In Fig. 1c, we find that the width of the conditional
distribution increases as the value of the symbol increases. In other words,
the conditional distribution for small symbolic values is narrower than that for large
symbolic values. The average value of the conditional distribution for each specific symbol
$S_T$ is calculated in order to observe the relationship between the specific symbolic values
and their conditional distributions. Fig. 1d shows the relationship between the specific symbol
and the average value. Circles and squares in Fig. 1d indicate the original and the surrogate
(shuffled) time series, respectively. We find that the average values of the conditional
distribution for the original time series are positively related to the magnitude of the
specific symbolic value, with $\mathrm{DVC}^{+}_{\mathrm{S\&P500}} = 0.57$ and
$\mathrm{DVC}^{-}_{\mathrm{S\&P500}} = 0.52$, while those for the shuffled time series do not
depend on the symbolic value $S_T$. The return time series of the S&P 500 index thus shows the
volatility clustering effect: larger (smaller) values follow larger (smaller) values.
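The logic of the shuffled surrogate can be illustrated with a short sketch. Shuffling keeps the distribution of returns but destroys their temporal order, so any statistic that measures dependence between successive magnitudes should collapse toward zero. For brevity the sketch uses the lag-1 autocorrelation of absolute returns as a stand-in statistic rather than the DVC itself, and the input is a synthetic series with slowly drifting volatility. Both choices are assumptions made only for illustration.

```python
import math
import random


def lag1_abs_autocorr(returns):
    """Lag-1 autocorrelation of |r(t)|, a simple proxy for volatility clustering."""
    a = [abs(r) for r in returns]
    mean = sum(a) / len(a)
    var = sum((x - mean) ** 2 for x in a)
    cov = sum((x - mean) * (y - mean) for x, y in zip(a, a[1:]))
    return cov / var


# Synthetic returns whose volatility drifts slowly (illustrative only).
random.seed(1)
returns = [
    random.gauss(0.0, 1.0 + 0.8 * math.sin(2 * math.pi * t / 500))
    for t in range(20000)
]

surrogate = returns[:]
random.shuffle(surrogate)  # same values, temporal order destroyed

original_stat = lag1_abs_autocorr(returns)
surrogate_stat = lag1_abs_autocorr(surrogate)
print(f"original:  {original_stat:.3f}")
print(f"surrogate: {surrogate_stat:.3f}")
```

A clearly positive value for the original series together with a value near zero for the surrogate indicates that the dependence comes from the ordering of the data and not from the shape of the return distribution.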

Next, we use the GARCH model to verify the usefulness of our method. The GARCH
model generates the volatility clustering effect. We create a new time series in which the
volatility clustering effect has been removed by GARCH(1,1) filtering and estimate its
degree of volatility clustering. Fig. 2 displays the degree of volatility clustering
for the 28 most liquid individual stocks traded on the NYSE. The
circles (red), diamonds (blue), squares (green), and triangles (pink) indicate
the degree of volatility clustering for the positive and negative return time series of
the original and the GARCH(1,1)-filtered data, respectively. In Fig. 2, we find that all
the stock return time series, regardless of the individual stock, exhibit volatility
clustering, with $0.38 < \mathrm{DVC}^{+} < 0.72$ and $-0.69 < \mathrm{DVC}^{-} < -0.32$.
However, after the volatility clustering behavior is eliminated by the GARCH(1,1) model,
the degree of volatility clustering is significantly reduced. This supports the validity of
our method for quantifying the volatility clustering effect in financial time series.
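A minimal sketch of GARCH(1,1) filtering is given below, assuming that "filtering" means dividing each return by the conditional standard deviation implied by the model (the standardized residuals). The parameter values `omega`, `alpha`, and `beta` are hypothetical and are treated as known. In practice they would be estimated from the data, typically by maximum likelihood, which is not shown here. The input is a simulated GARCH(1,1) series, not market data.

```python
import math
import random


def simulate_garch11(n, omega, alpha, beta, seed=0):
    """Simulate r(t) = sigma(t) * eps(t) with GARCH(1,1) conditional variance."""
    rng = random.Random(seed)
    var = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    returns = []
    for _ in range(n):
        r = math.sqrt(var) * rng.gauss(0.0, 1.0)
        returns.append(r)
        var = omega + alpha * r * r + beta * var
    return returns


def garch11_filter(returns, omega, alpha, beta):
    """Return the standardized residuals r(t) / sigma(t) for given parameters."""
    var = omega / (1.0 - alpha - beta)
    residuals = []
    for r in returns:
        residuals.append(r / math.sqrt(var))
        var = omega + alpha * r * r + beta * var
    return residuals


def lag1_autocorr(x):
    """Lag-1 sample autocorrelation of a sequence."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((a - mean) * (b - mean) for a, b in zip(x, x[1:]))
    return cov / var


# Hypothetical parameters with alpha + beta < 1 (stationary variance).
omega, alpha, beta = 0.1, 0.2, 0.7
returns = simulate_garch11(20000, omega, alpha, beta, seed=2)
residuals = garch11_filter(returns, omega, alpha, beta)

raw_acf = lag1_autocorr([abs(r) for r in returns])
filtered_acf = lag1_autocorr([abs(e) for e in residuals])
print(f"|r| lag-1 autocorrelation, raw:      {raw_acf:.3f}")
print(f"|r| lag-1 autocorrelation, filtered: {filtered_acf:.3f}")
```

The filtered series should show almost no dependence between successive magnitudes, which is the behavior expected of a series whose volatility clustering has been removed.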



IV. CONCLUSIONS 



We proposed a novel method to quantify volatility clustering behavior in financial
time series and calculated the degree of volatility clustering (DVC) for diverse
stock prices. We found that all financial data analyzed exhibited volatility
clustering, whereas for the data filtered with the
GARCH(1,1) model the degree of volatility clustering was significantly reduced. This
result confirmed that our method measures the volatility clustering effect in financial time
series well. Our method might be applied to detailed clustering analysis of diverse complex
signals, including climate records and heart rate variability (HRV), as well as financial
time series. Further studies on volatility clustering will examine these systems more extensively.

This work was supported by the Korea Research Foundation funded by the Korean Government
(MOEHRD) (KRF-2005-042-B00075), by the MOST/KOSEF to the National
Core Research Center for Systems Bio-Dynamics (R15-2004-033), and by the Ministry of
Education through the program BK 21.

FIG. 1: (a) The return time series of the 5-minute S&P 500 index for 9 years from 1995 to 2004. (b)
Its symbolic time series. (c) The conditional distributions of specific symbolic values. Each marker
indicates a specific symbolic value. (d) The degree of volatility clustering of the S&P 500 index.
The circles and the squares indicate the shuffled and the original time series, respectively.

FIG. 2: The degree of volatility clustering for the 28 individual stocks traded on the NYSE
from 1993 to 2002 and for their GARCH-filtered data. The circles (red), the diamonds (blue), the
squares (green), and the triangles (pink) indicate the positive and the negative data of the
original and the GARCH(1,1)-filtered data, respectively.